Automatic Contextual Text Correction Using the Linguistic Habits Graph (LHG)
Authors
Abstract
Automatic text correction is an important problem for today's word processors and text editors. This article presents an innovative algorithm for automating contextual text correction using the Linguistic Habits Graph (LHG), which is also described in the article. To this end, a specialized web crawler was built that searches web pages in order to construct the LHG from an analysis of text corpora obtained from Polish-language websites. The correction results produced by this LHG-based algorithm were compared with commercial text-correction programs such as Microsoft Word 2007 and Open Office Writer 3.0, as well as with the Google search engine. The results proved to be significantly better than those of the aforementioned commercial tools.
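The abstract does not specify how the LHG is structured or how the correction algorithm scores candidates. As a rough illustration only, the sketch below assumes the graph can be approximated by a directed graph of word-succession counts collected from a corpus, and that a correction is chosen by picking the candidate most often seen after the preceding word. The function names `build_graph` and `pick_candidate`, the toy corpus, and the candidate list are all hypothetical and are not taken from the article.

```python
from collections import defaultdict


def build_graph(sentences):
    """Count how often each word is directly followed by another word.

    Assumption: a simple bigram-count graph stands in for the LHG here.
    """
    graph = defaultdict(lambda: defaultdict(int))
    for sentence in sentences:
        words = sentence.lower().split()
        for left, right in zip(words, words[1:]):
            graph[left][right] += 1
    return graph


def pick_candidate(graph, previous_word, candidates):
    """Return the candidate seen most often after previous_word in the corpus."""
    followers = graph.get(previous_word.lower(), {})
    return max(candidates, key=lambda word: followers.get(word.lower(), 0))


# Hypothetical toy corpus standing in for text gathered by a web crawler.
corpus = [
    "she went to the bank",
    "he went to the bank today",
    "they went to the park",
]
graph = build_graph(corpus)

# Choose among hypothetical spelling candidates using the preceding word as context.
best = pick_candidate(graph, "the", ["bark", "bank", "band"])
print(best)
```

A real system would need a far larger corpus, a way to generate candidates (for example by edit distance), and probably more context than a single preceding word; none of those details are given in the abstract.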
Similar resources
The Impact of Contextual Clue Selection on Inference
Linguistic information can be conveyed in the form of speech and written text, but it is the content of the message that is ultimately essential for higher-level processes in language comprehension, such as making inferences and associations between text information and knowledge about the world. Linguistically, inference is the shovel that allows receivers to dig meaning out from the text with...
Emotion Detection in Persian Text; A Machine Learning Model
This study aimed to develop a computational model for recognition of emotion in Persian text as a supervised machine learning problem. We considered the Plutchik emotion model as the supervised learning criterion and a Support Vector Machine (SVM) as the baseline classifier. We also used the NRC lexicon and contextual features as training data and components of the model. One hundred selected texts including pol...
Cipher text only attack on speech time scrambling systems using correction of audio spectrogram
Recently permutation multimedia ciphers were broken in a chosen-plaintext scenario. That attack models a very resourceful adversary which may not always be the case. To show insecurity of these ciphers, we present a cipher-text only attack on speech permutation ciphers. We show inherent redundancies of speech can pave the path for a successful cipher-text only attack. To that end, regularities ...
Towards Contextual Healthiness Classification of Food Items - A Linguistic Approach
We explore the feasibility of contextual healthiness classification of food items. We present a detailed analysis of the linguistic phenomena that need to be taken into consideration for this task based on a specially annotated corpus extracted from web forum entries. For automatic classification, we compare a supervised classifier and rule-based classification. Beyond linguistically motivated ...
Multi-level post-processing for Korean character recognition using morphological analysis and linguistic evaluation
Most of the post-processing methods for character recognition rely on contextual information of character and word-fragment levels. However, due to linguistic characteristics of Korean, such low-level information alone is not sufficient for high-quality character-recognition applications, and we need much higher-level contextual information to improve the recognition results. This paper present...
Journal title:
- Computer Science (AGH)
Volume 10, issue
Pages -
Publication date 2009